Data lake: English-language Wikipedia

Data lake
A data lake is a large storage repository and processing engine. Data lakes provide "massive storage for any kind of data, enormous processing power and the ability to handle virtually limitless concurrent tasks or jobs". The term was coined by James Dixon, Pentaho's chief technology officer. Dixon initially used the term to contrast with "data mart", a smaller repository of interesting attributes extracted from the raw data. He wrote: "If you think of a datamart as a store of bottled water – cleansed and packaged and structured for easy consumption – the data lake is a large body of water in a more natural state. The contents of the data lake stream in from a source to fill the lake, and various users of the lake can come to examine, dive in, or take samples." Dixon argued that data marts have several inherent problems, and that data lakes are the optimal solution.
Dixon identified two shortcomings of data marts: "Only a subset of the attributes are examined, so only pre-determined questions can be answered" and "The data is aggregated so visibility into the lowest levels is lost." These problems are often referred to as "siloing" and, in agreement with Dixon, PricewaterhouseCoopers says that data lakes could "put an end to data silos". In its study on data lakes, the firm notes that "Enterprises across industries are starting to extract and place data for analytics into a single, Hadoop based repository." It cites organizations such as UC Irvine Medical Center, Google and Facebook that have embraced the data lake concept.
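The two shortcomings Dixon describes can be illustrated with a minimal Python sketch. The records, field names, and the helper functions `build_mart` and `total_by` are all hypothetical and chosen only for illustration; they do not come from Dixon or from any particular product. The "mart" keeps a subset of attributes and aggregates them, while the raw records stand in for data kept in its natural state.

```python
from collections import defaultdict

# Hypothetical raw records, standing in for data kept in its natural state.
raw_events = [
    {"user": "alice", "region": "EU", "product": "A", "amount": 30, "device": "mobile"},
    {"user": "bob", "region": "EU", "product": "B", "amount": 20, "device": "desktop"},
    {"user": "carol", "region": "US", "product": "A", "amount": 50, "device": "mobile"},
]


def build_mart(events):
    """Keep only region and amount, aggregated to one total per region.

    This mirrors both shortcomings: other attributes are dropped, and
    the individual records are no longer visible.
    """
    totals = defaultdict(int)
    for event in events:
        totals[event["region"]] += event["amount"]
    return dict(totals)


def total_by(events, attribute):
    """Answer an ad hoc question directly against the raw records."""
    totals = defaultdict(int)
    for event in events:
        totals[event[attribute]] += event["amount"]
    return dict(totals)


mart = build_mart(raw_events)

# The mart answers the pre-determined question (totals per region) ...
print(mart)

# ... but a question about devices can only be answered from the raw records,
# because the "device" attribute never made it into the mart.
print(total_by(raw_events, "device"))
```

In this sketch the aggregated view cannot be used to recover per-device or per-user figures, which is the loss of visibility into the lowest levels that Dixon points to.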
PricewaterhouseCoopers also claims that cost is a major reason that organizations adopt data lakes.
== Examples of data lakes ==
Currently the only viable example of a data lake is Apache Hadoop.
Many companies also use cloud storage services such as Amazon S3, along with other open source tools such as Docker, as a data lake. There is also academic interest in the concept of data lakes; for example, Personal DataLake (http://ieeexplore.ieee.org/xpl/abstractAuthors.jsp?reload=true&arnumber=7310733) is an ongoing project at Cardiff University to create a new type of data lake that manages the big data of individual users and provides a single point for collecting, organizing, and sharing personal data (http://www.researchgate.net/publication/283053696_Personal_Data_Lake_With_Data_Gravity_Pull).

Excerpt source: Wikipedia, the free encyclopedia